Data specifics

  • The Longitudinal Employer-Household Dynamics (LEHD) program at the US Census Bureau releases the LEHD Origin-Destination Employment Statistics (LODES) datasets annually, based on employer-employee records from state unemployment insurance systems.
  • This data file draws on the LODES Origin-Destination (OD) files. Each OD record pairs the census block where workers live with the census block where they work, which lets us estimate average commute distance from the distance between each home and workplace block pair.
  • Distance calculations: All distances are "as the crow flies" and are calculated using the Vincenty ellipsoid method from the latitude and longitude of each census block's centroid. These distances are then aggregated to the census block group and tract levels.
  • Data presented here are from 2018, and spatial units are based on the 2010 census. As of July 2021, 2018 is the most recent year for which data are available; the earliest is 2002.
  • The data contain average and median commute distances for each spatial unit (SU), calculated for the following groups: (1) people who live and work in the Eastern Shore region; (2) Eastern Shore residents who work within 40 miles of their home census block; (3) Eastern Shore residents who work in a tract outside the Eastern Shore that employs at least 25 Eastern Shore residents; (4) all Eastern Shore area residents represented in the LODES OD data. The data also contain the number of residents within each SU who fall into each of those four categories.
  • Some limitations:
    • The data are prone to imperfect geocoding for certain jobs; jobs at companies with multiple branches are often all coded to the same location. Distance calculations are therefore likely to be overestimates if many residents within one SU are employed by a company with multiple branches or a company whose headquarters is far away. There is also no way to identify remote workers or to know how often any worker actually travels to their place of employment (though note that these data were collected prior to the COVID-19 pandemic, when fewer people were working remotely). For these reasons, we include calculations of average and median commute distances based on multiple groups of workers. The estimates based on all residents in an SU are the most likely to be overestimates, while those based on residents working within 40 miles of home are likely to be the most conservative.
    • The distances are "as the crow flies" and are therefore imprecise estimates of actual on-road commute distances.
    • These data do not include workers in defense-related industries.
    • Student-workers are unlikely to be represented in these data because their jobs are not typically covered by state unemployment insurance.
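The distance and aggregation steps described above can be sketched as follows. This is a minimal illustration, not the code used to build the data file: it assumes the `geosphere` package's `distVincentyEllipsoid()` for the Vincenty ellipsoid distance, uses the LODES OD column names `h_geocode`, `w_geocode`, and `S000` (total jobs), and weights each home-work pair by its job count. The centroid columns (`h_lon`, `h_lat`, `w_lon`, `w_lat`), the helper `summarise_commute()`, and all values are invented for the example.

```r
library(dplyr)
library(geosphere)

# Toy OD rows: one row per home-block/work-block pair (values are made up).
od <- data.frame(
  h_geocode = c("510010901001000", "510010901001000",
                "510010901002001", "510010901002001"),
  w_geocode = c("510010902001005", "511310930001010",
                "510010902001005", "517100001001000"),
  S000  = c(3L, 1L, 2L, 1L),                 # jobs for this home-work pair
  h_lon = c(-75.70, -75.70, -75.65, -75.65), # hypothetical home centroids
  h_lat = c( 37.70,  37.70,  37.75,  37.75),
  w_lon = c(-75.66, -75.90, -75.66, -76.29), # hypothetical work centroids
  w_lat = c( 37.72,  37.30,  37.72,  36.85),
  stringsAsFactors = FALSE
)

meters_per_mile <- 1609.344

od <- od %>%
  mutate(
    # distVincentyEllipsoid() expects (lon, lat) pairs and returns meters
    dist_mi  = distVincentyEllipsoid(cbind(h_lon, h_lat),
                                     cbind(w_lon, w_lat)) / meters_per_mile,
    # The first 12 digits of a 15-digit block code identify the block group
    # (the first 11 identify the tract)
    blkgroup = substr(h_geocode, 1, 12),
    within40 = dist_mi <= 40
  )

# Job-weighted average and median commute per block group (assumed weighting)
summarise_commute <- function(df) {
  df %>%
    group_by(blkgroup) %>%
    summarise(
      avgc      = weighted.mean(dist_mi, S000),
      medc      = median(rep(dist_mi, S000)),
      commuters = sum(S000),
      .groups   = "drop"
    )
}

by_bg_all <- summarise_commute(od)                    # all residents
by_bg_w40 <- summarise_commute(filter(od, within40))  # work within 40 miles
```

Comparing `by_bg_all` with `by_bg_w40` shows how a single long-distance pair can pull the all-residents average upward, which is why the data file reports several worker groups side by side.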

Variable descriptions

meta %>% 
  filter(su_blkgp == 1) %>%
  select(varname, about) %>% as.list()
## $varname
##  [1] "blkgroup"             "county"               "avgc_allblk"         
##  [4] "medc_allblk"          "commutersInBlgr"      "avgc_within40blk"    
##  [7] "medc_within40blk"     "commuterw40blk"       "avgc_25_employeesblk"
## [10] "medc_25_employeesblk" "commuter25blk"        "avgc_workinRegionblk"
## [13] "medc_workinRegionblk" "commuterinRegionblk" 
## 
## $about
##  [1] "12-digit census block group code"                                                                                                                                                     
##  [2] "5-digit county code"                                                                                                                                                                  
##  [3] "Average \"as the crow flies\" commuting distance for all residents of the census block group"                                                                                         
##  [4] "Median \"as the crow flies\" commuting distance for all residents of the census block group"                                                                                          
##  [5] "The number of residents in each census block group who are represented in the data"                                                                                                   
##  [6] "Average \"as the crow flies\" commuting distance for residents of the census block group who work within 40 miles"                                                                    
##  [7] "Median \"as the crow flies\" commuting distance for residents of the census block group who work within 40 miles"                                                                     
##  [8] "The number of residents in the census block group who work within 40 miles of home"                                                                                                   
##  [9] "Average \"as the crow flies\" commuting distance for residents of the census block group who commute to a census tract that employs at least 25 residents from the region of interest"
## [10] "Median \"as the crow flies\" commuting distance for residents of the census block group who commute to a census tract that employs at least 25 residents of the region of interest"   
## [11] "The number of residents in the census block group who commute to a census tract that employs at least 25 residents of the region of interest"                                         
## [12] "Average \"as the crow flies\" commuting distance for residents of the census block group who work in the same region as where they live"                                              
## [13] "Median \"as the crow flies\" commuting distance for residents of the census block group who work in the same region as where they live"                                               
## [14] "The number of residents in the census block group who commute to work within the region of interest"
glimpse(lodes)
## Rows: 43
## Columns: 14
## $ blkgroup             <dbl> 510010901001, 510010901002, 510010901003, 5100109…
## $ medc_allblk          <dbl> 45.87512, 45.43606, 37.65439, 41.07406, 43.54075,…
## $ medc_workinRegionblk <dbl> 8.998906, 7.036666, 8.178206, 8.454102, 8.173306,…
## $ medc_within40blk     <dbl> 8.370839, 6.173518, 7.641535, 6.434638, 8.382186,…
## $ medc_25_employeesblk <dbl> 13.525101, 18.134192, 21.045550, 11.301578, 19.61…
## $ commutersInBlgr      <int> 429, 370, 288, 261, 263, 972, 657, 276, 285, 339,…
## $ commuterinRegionblk  <int> 298, 250, 205, 180, 168, 647, 432, 156, 167, 202,…
## $ commuterw40blk       <int> 291, 248, 202, 176, 170, 643, 437, 163, 173, 210,…
## $ commuter25blk        <int> 323, 286, 234, 202, 201, 746, 492, 177, 196, 233,…
## $ avgc_allblk          <dbl> 45.60011, 44.03670, 39.39414, 42.69355, 45.55925,…
## $ avgc_workinRegionblk <dbl> 9.695082, 7.122045, 8.307257, 8.208582, 8.390502,…
## $ avgc_within40blk     <dbl> 8.820367, 6.808916, 7.761259, 7.403881, 8.730307,…
## $ avgc_25_employeesblk <dbl> 16.67561, 18.71414, 18.45512, 19.53141, 22.44348,…
## $ county               <int> 51001, 51001, 51001, 51001, 51001, 51001, 51001, …
lodes %>% select(avgc_allblk, avgc_within40blk, avgc_25_employeesblk, avgc_workinRegionblk, medc_allblk, medc_within40blk, medc_25_employeesblk, medc_workinRegionblk) %>% 
  select(where(~is.numeric(.x))) %>% 
  as.data.frame() %>% 
  stargazer(., type = "text", title = "Summary Statistics", digits = 2,
            summary.stat = c("mean", "sd", "min", "median", "max"))
## 
## Summary Statistics
## =======================================================
## Statistic            Mean  St. Dev.  Min  Median  Max  
## -------------------------------------------------------
## avgc_allblk          44.36  13.69   29.55 42.67  106.11
## avgc_within40blk     10.12   3.43   6.81   9.43  22.27 
## avgc_25_employeesblk 18.57  10.90   9.58  16.53  66.99 
## avgc_workinRegionblk 8.30    3.23   5.11   7.62  20.23 
## medc_allblk          41.92  14.29   24.75 38.71  104.27
## medc_within40blk     9.95    3.94   5.54   9.07  24.04 
## medc_25_employeesblk 16.03  11.14   7.90  13.56  63.29 
## medc_workinRegionblk 8.30    3.66   4.57   7.42  24.04 
## -------------------------------------------------------

Visual distribution

long <- lodes %>% select(c(blkgroup, avgc_allblk, avgc_within40blk, avgc_25_employeesblk, avgc_workinRegionblk, medc_allblk, medc_within40blk, medc_25_employeesblk, medc_workinRegionblk)) %>% 
  pivot_longer(-blkgroup, names_to = "measure", values_to = "value")
long$measure <- factor(long$measure,
                         levels = c("avgc_allblk", "medc_allblk", "avgc_within40blk", "medc_within40blk", "avgc_25_employeesblk", "medc_25_employeesblk", "avgc_workinRegionblk", "medc_workinRegionblk"))
long %>% 
  ggplot(aes(x = value, fill = measure)) +
  scale_fill_viridis(option = "plasma", discrete = TRUE, guide = "none") +
  geom_histogram() + 
  facet_wrap(~measure, scales = "free", ncol = 2)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

meta %>% 
  filter(varname %in% c("avgc_allblk", "medc_allblk", "avgc_within40blk", "medc_within40blk", "avgc_25_employeesblk", "medc_25_employeesblk", "avgc_workinRegionblk","medc_workinRegionblk")) %>%
  mutate(label = paste0(varname, ": ", about)) %>% 
  select(label) %>% 
  as.list()

## $label
## [1] "avgc_allblk: Average \"as the crow flies\" commuting distance for all residents of the census block group"
## [2] "medc_allblk: Median \"as the crow flies\" commuting distance for all residents of the census block group"
## [3] "avgc_within40blk: Average \"as the crow flies\" commuting distance for residents of the census block group who work within 40 miles"
## [4] "medc_within40blk: Median \"as the crow flies\" commuting distance for residents of the census block group who work within 40 miles"
## [5] "avgc_25_employeesblk: Average \"as the crow flies\" commuting distance for residents of the census block group who commute to a census tract that employs at least 25 residents from the region of interest"
## [6] "medc_25_employeesblk: Median \"as the crow flies\" commuting distance for residents of the census block group who commute to a census tract that employs at least 25 residents of the region of interest"
## [7] "avgc_workinRegionblk: Average \"as the crow flies\" commuting distance for residents of the census block group who work in the same region as where they live"
## [8] "medc_workinRegionblk: Median \"as the crow flies\" commuting distance for residents of the census block group who work in the same region as where they live"

Mapping the data

All Eastern Shore region workers

pal <- colorNumeric("plasma", reverse = T, domain = east_lodes$avgc_allblk)
leaflet(east_lodes) %>% 
  addProviderTiles("CartoDB.Positron") %>% 
  addPolygons(data = east_lodes,
              fillColor = ~pal(avgc_allblk),
              weight = 1,
              opacity = 1,
              color = "white", 
              fillOpacity = 0.6,
              highlightOptions = highlightOptions(
                weight = 1, fillOpacity = 0.8, bringToFront = TRUE
              ),
              popup = paste0("GEOID: ", east_lodes$blkgroup, "<br>",
                             "Average commute (mi): ", round(east_lodes$avgc_allblk, 2))) %>% 
  addLegend("bottomright", pal = pal, values = east_lodes$avgc_allblk, 
            title = "Average commute (mi)", opacity = 0.7)
meta %>% 
  filter(varname %in% c("avgc_allblk")) %>%
  mutate(label = paste0(varname, ": ", about)) %>% 
  select(label) %>% 
  as.list()

## $label
## [1] "avgc_allblk: Average \"as the crow flies\" commuting distance for all residents of the census block group"

Eastern Shore region workers who work within 40 miles of home

pal <- colorNumeric("plasma", reverse = T, domain = east_lodes$avgc_within40blk)
leaflet(east_lodes) %>% 
  addProviderTiles("CartoDB.Positron") %>% 
  addPolygons(data = east_lodes,
              fillColor = ~pal(avgc_within40blk),
              weight = 1,
              opacity = 1,
              color = "white", 
              fillOpacity = 0.6,
              highlightOptions = highlightOptions(
                weight = 1, fillOpacity = 0.8, bringToFront = TRUE
              ),
              popup = paste0("GEOID: ", east_lodes$blkgroup, "<br>",
                             "Average commute (mi): ", round(east_lodes$avgc_within40blk, 2))) %>% 
  addLegend("bottomright", pal = pal, values = east_lodes$avgc_within40blk, 
            title = "Average commute (mi)", opacity = 0.7)
meta %>% 
  filter(varname %in% c("avgc_within40blk")) %>%
  mutate(label = paste0(varname, ": ", about)) %>% 
  select(label) %>% 
  as.list()

## $label
## [1] "avgc_within40blk: Average \"as the crow flies\" commuting distance for residents of the census block group who work within 40 miles"

Eastern Shore region workers who work in a tract that employs >=25 Eastern Shore region residents

pal <- colorNumeric("plasma", reverse = T, domain = east_lodes$avgc_25_employeesblk)
leaflet(east_lodes) %>% 
  addProviderTiles("CartoDB.Positron") %>% 
  addPolygons(data = east_lodes,
              fillColor = ~pal(avgc_25_employeesblk),
              weight = 1,
              opacity = 1,
              color = "white", 
              fillOpacity = 0.6,
              highlightOptions = highlightOptions(
                weight = 1, fillOpacity = 0.8, bringToFront = TRUE
              ),
              popup = paste0("GEOID: ", east_lodes$blkgroup, "<br>",
                             "Average commute (mi): ", round(east_lodes$avgc_25_employeesblk, 2))) %>% 
  addLegend("bottomright", pal = pal, values = east_lodes$avgc_25_employeesblk, 
            title = "Average commute (mi)", opacity = 0.7)
meta %>% 
  filter(varname %in% c("avgc_25_employeesblk")) %>%
  mutate(label = paste0(varname, ": ", about)) %>% 
  select(label) %>% 
  as.list()

## $label
## [1] "avgc_25_employeesblk: Average \"as the crow flies\" commuting distance for residents of the census block group who commute to a census tract that employs at least 25 residents from the region of interest"

Eastern Shore region workers who also work in the Eastern Shore region

pal <- colorNumeric("plasma", reverse = T, domain = east_lodes$avgc_workinRegionblk)
leaflet(east_lodes) %>% 
  addProviderTiles("CartoDB.Positron") %>% 
  addPolygons(data = east_lodes,
              fillColor = ~pal(avgc_workinRegionblk),
              weight = 1,
              opacity = 1,
              color = "white", 
              fillOpacity = 0.6,
              highlightOptions = highlightOptions(
                weight = 1, fillOpacity = 0.8, bringToFront = TRUE
              ),
              popup = paste0("GEOID: ", east_lodes$blkgroup, "<br>",
                             "Average commute (mi): ", round(east_lodes$avgc_workinRegionblk, 2))) %>% 
  addLegend("bottomright", pal = pal, values = east_lodes$avgc_workinRegionblk, 
            title = "Average commute (mi)", opacity = 0.7)
meta %>% 
  filter(varname %in% c("avgc_workinRegionblk")) %>%
  mutate(label = paste0(varname, ": ", about)) %>% 
  select(label) %>% 
  as.list()

## $label
## [1] "avgc_workinRegionblk: Average \"as the crow flies\" commuting distance for residents of the census block group who work in the same region as where they live"
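
The four maps in this section differ only in the column being drawn, so the shared steps could be wrapped in one function. The sketch below is one possible way to do that, not part of the original analysis: `map_commute()` is a hypothetical helper, and `toy` is a made-up two-polygon `sf` object standing in for `east_lodes` so the example runs on its own.

```r
library(leaflet)
library(sf)

# Hypothetical helper: draws one choropleth for the column named in `var`.
map_commute <- function(data, var, title = "Average commute (mi)") {
  values <- data[[var]]
  pal <- colorNumeric("plasma", reverse = TRUE, domain = values)
  leaflet(data) %>%
    addProviderTiles("CartoDB.Positron") %>%
    addPolygons(fillColor = pal(values),
                weight = 1,
                opacity = 1,
                color = "white",
                fillOpacity = 0.6,
                highlightOptions = highlightOptions(
                  weight = 1, fillOpacity = 0.8, bringToFront = TRUE
                ),
                popup = paste0("GEOID: ", data$blkgroup, "<br>",
                               title, ": ", round(values, 2))) %>%
    addLegend("bottomright", pal = pal, values = values,
              title = title, opacity = 0.7)
}

# Made-up square polygons so the helper can be tried without east_lodes
square <- function(x0, y0, size = 0.1) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + size, y0),
                        c(x0 + size, y0 + size), c(x0, y0 + size),
                        c(x0, y0))))
}
toy <- st_sf(
  blkgroup    = c("510010901001", "510010901002"),
  avgc_allblk = c(45.6, 44.0),
  geometry    = st_sfc(square(-75.7, 37.7), square(-75.6, 37.7), crs = 4326)
)

m <- map_commute(toy, "avgc_allblk")
```

With the real data, a call such as `map_commute(east_lodes, "avgc_within40blk")` would be expected to reproduce the corresponding map above, assuming `east_lodes` is an `sf` object with a `blkgroup` column.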